92 research outputs found

    Algebraic properties of structured context-free languages: old approaches and novel developments

    Full text link
    The historical research line on the algebraic properties of structured CF languages initiated by McNaughton's Parenthesis Languages has recently attracted much renewed interest with the Balanced Languages, the Visibly Pushdown Automata languages (VPDA), the Synchronized Languages, and the Height-deterministic ones. Such families preserve to a varying degree the basic algebraic properties of Regular languages: boolean closure, closure under reversal, under concatenation, and Kleene star. We prove that the VPDA family is strictly contained within the Floyd Grammars (FG) family historically known as operator precedence. Languages over the same precedence matrix are known to be closed under boolean operations, and are recognized by a machine whose pop or push operations on the stack are purely determined by terminal letters. We characterize VPDA's as the subclass of FG having a peculiarly structured set of precedence relations, and balanced grammars as a further restricted case. The non-counting invariance property of FG has a direct implication for VPDA too.Comment: Extended version of paper presented at WORDS2009, Salerno,Italy, September 200

    Higher-Order Operator Precedence Languages

    Get PDF
    Floyd's Operator Precedence (OP) languages are a deterministic context-free family having many desirable properties. They are locally and parallely parsable, and languages having a compatible structure are closed under Boolean operations, concatenation and star; they properly include the family of Visibly Pushdown (or Input Driven) languages. OP languages are based on three relations between any two consecutive terminal symbols, which assign syntax structure to words. We extend such relations to k-tuples of consecutive terminal symbols, by using the model of strictly locally testable regular languages of order k at least 3. The new corresponding class of Higher-order Operator Precedence languages (HOP) properly includes the OP languages, and it is still included in the deterministic (also in reverse) context free family. We prove Boolean closure for each subfamily of structurally compatible HOP languages. In each subfamily, the top language is called max-language. We show that such languages are defined by a simple cancellation rule and we prove several properties, in particular that max-languages make an infinite hierarchy ordered by parameter k. HOP languages are a candidate for replacing OP languages in the various applications where they have have been successful though sometimes too restrictive.Comment: In Proceedings AFL 2017, arXiv:1708.0622

    Commutative Languages and their Composition by Consensual Methods

    Get PDF
    Commutative languages with the semilinear property (SLIP) can be naturally recognized by real-time NLOG-SPACE multi-counter machines. We show that unions and concatenations of such languages can be similarly recognized, relying on -- and further developing, our recent results on the family of consensually regular (CREG) languages. A CREG language is defined by a regular language on the alphabet that includes the terminal alphabet and its marked copy. New conditions, for ensuring that the union or concatenation of CREG languages is closed, are presented and applied to the commutative SLIP languages. The paper contributes to the knowledge of the CREG family, and introduces novel techniques for language composition, based on arithmetic congruences that act as language signatures. Open problems are listed.Comment: In Proceedings AFL 2014, arXiv:1405.527

    Aperiodicity, Star-freeness, and First-order Definability of Structured Context-Free Languages

    Full text link
    A classic result in formal language theory is the equivalence among noncounting, or aperiodic, regular languages, and languages defined through star-free regular expressions, or first-order logic. Together with first-order completeness of linear temporal logic these results constitute a theoretical foundation for model-checking algorithms. Extending these results to structured subclasses of context-free languages, such as tree-languages did not work as smoothly: for instance W. Thomas showed that there are star-free tree languages that are counting. We show, instead, that investigating the same properties within the family of operator precedence languages leads to equivalences that perfectly match those on regular languages. The study of this old family of context-free languages has been recently resumed to enhance not only parsing (the original motivation of its inventor R. Floyd) but also to exploit their algebraic and logic properties. We have been able to reproduce the classic results of regular languages for this much larger class by going back to string languages rather than tree languages. Since operator precedence languages strictly include other classes of structured languages such as visibly pushdown languages, the same results given in this paper hold as trivial corollary for that family too

    Algebraic properties of operator precedence languages

    Get PDF
    This paper presents new results on the algebraic ordering properties of operator precedence grammars and languages. This work was motivated by, and applied to, the mechanical acquisition or inference of operator precedence grammars. A new normal form of operator precedence grammars called homogeneous is defined. An algorithm is given to construct a grammar, called max-grammar, generating the largest language which is compatible with a given precedence matrix. Then the class of free grammars is introduced as a special subclass of operator precedence grammars. It is shown that operator precedence languages corresponding to a given precedence matrix form a Boolean algebra

    Toward a theory of input-driven locally parsable languages

    Get PDF
    If a context-free language enjoys the local parsability property then, no matter how the source string is segmented, each segment can be parsed independently, and an efficient parallel parsing algorithm becomes possible. The new class of locally chain parsable languages (LCPLs), included in the deterministic context-free language family, is here defined by means of the chain-driven automaton and characterized by decidable properties of grammar derivations. Such automaton decides whether to reduce or not a substring in a way purely driven by the terminal characters, thus extending the well-known concept of input-driven (ID) alias visibly pushdown machines. The LCPL family extends and improves the practically relevant Floyd's operator-precedence (OP) languages which are known to strictly include the ID languages, and for which a parallel-parser generator exists

    Parallel parsing made practical

    Get PDF
    The property of local parsability allows to parse inputs through inspecting only a bounded-length string around the current token. This in turn enables the construction of a scalable, data-parallel parsing algorithm, which is presented in this work. Such an algorithm is easily amenable to be automatically generated via a parser generator tool, which was realized, and is also presented in the following. Furthermore, to complete the framework of a parallel input analysis, a parallel scanner can also combined with the parser. To prove the practicality of a parallel lexing and parsing approach, we report the results of the adaptation of JSON and Lua to a form fit for parallel parsing (i.e. an operator-precedence grammar) through simple grammar changes and scanning transformations. The approach is validated with performance figures from both high performance and embedded multicore platforms, obtained analyzing real-world inputs as a test-bench. The results show that our approach matches or dominates the performances of production-grade LR parsers in sequential execution, and achieves significant speedups and good scaling on multi-core machines. The work is concluded by a broad and critical survey of the past work on parallel parsing and future directions on the integration with semantic analysis and incremental parsing
    • …
    corecore